
feat(dataset): port dry-run and impact from Go CLI #16

Draft
beengud wants to merge 1 commit into main from feat/dataset-impact
Conversation

@beengud beengud commented Jun 24, 2026

Owner

Closes #5

Ported surface

Adds two subcommands to the existing dataset route (existing list/view untouched):

  • dataset dry-run <file.json> — Dry-runs saving a dataset pipeline; nothing is persisted. Prints the dataset that would be saved (Dataset: <name> (<id>)), datasets that would be rematerialized (Would rematerialize: <name> (<id>)), and compile/validation errors (Error in <name>: <text>). Exits 1 when any error datasets are returned.
  • dataset impact <file.json> — Renders a table (NAME / ID / IMPACT) of downstream datasets affected by saving the pipeline. Compile/validation errors go to stderr. Supports --format json|csv / --json. Nothing is persisted.
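The dry-run reporting rules described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `DryRunResult` shape and the `renderDryRun` helper are hypothetical names, and returning exit code 0 when there are no error datasets is an assumption (the description only states the exit-1 case).

```typescript
// Hypothetical result shape for illustration; not the generated GraphQL types.
interface DatasetRef {
  id: string;
  name: string;
}

interface DatasetErrorInfo {
  datasetName: string;
  text: string;
}

interface DryRunResult {
  dataset: DatasetRef | null;
  rematerialized: DatasetRef[];
  errors: DatasetErrorInfo[];
}

// Builds the report lines in the formats listed in the description and
// picks the exit code: 1 if any error datasets came back, else 0 (assumed).
function renderDryRun(result: DryRunResult): { lines: string[]; exitCode: number } {
  const lines: string[] = [];
  if (result.dataset) {
    lines.push(`Dataset: ${result.dataset.name} (${result.dataset.id})`);
  }
  for (const d of result.rematerialized) {
    lines.push(`Would rematerialize: ${d.name} (${d.id})`);
  }
  for (const e of result.errors) {
    lines.push(`Error in ${e.datasetName}: ${e.text}`);
  }
  return { lines, exitCode: result.errors.length > 0 ? 1 : 0 };
}
```

Keeping the rendering pure (returning lines and an exit code instead of printing and exiting) makes the behavior easy to unit test; the caller decides where to write the lines.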

Input JSON shape (matches the Go CLI):

{ "workspaceId": "...", "dataset": { "name": "..." },
  "query": { "stageQueries": [{ "stageID": "...", "pipeline": "..." }] } }
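As a rough sketch, the input shape above could be typed and validated like this. The interface and function names (`DatasetFileInput`, `parseDatasetFile`, `isRecord`) are hypothetical, and the specific error messages are illustrative; only the field names come from the JSON shape shown above.

```typescript
// Mirrors the JSON shape shown above; type names are illustrative.
interface StageQuery {
  stageID: string;
  pipeline: string;
}

interface DatasetFileInput {
  workspaceId: string;
  dataset: { name: string };
  query: { stageQueries: StageQuery[] };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Parses the file contents and rejects anything that does not match the shape.
function parseDatasetFile(text: string): DatasetFileInput {
  const raw: unknown = JSON.parse(text);
  if (!isRecord(raw)) throw new Error("input must be a JSON object");

  const workspaceId = raw.workspaceId;
  if (typeof workspaceId !== "string") throw new Error("workspaceId must be a string");

  const dataset = raw.dataset;
  const name = isRecord(dataset) ? dataset.name : undefined;
  if (typeof name !== "string") throw new Error("dataset.name must be a string");

  const query = raw.query;
  const list = isRecord(query) ? query.stageQueries : undefined;
  if (!Array.isArray(list)) throw new Error("query.stageQueries must be an array");

  const stageQueries = list.map((s: unknown, i: number): StageQuery => {
    const stageID = isRecord(s) ? s.stageID : undefined;
    const pipeline = isRecord(s) ? s.pipeline : undefined;
    if (typeof stageID !== "string" || typeof pipeline !== "string") {
      throw new Error(`query.stageQueries[${i}] needs string stageID and pipeline`);
    }
    return { stageID, pipeline };
  });

  return { workspaceId, dataset: { name }, query: { stageQueries } };
}
```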

Schema-vs-Go corrections (the Go source did not match the published SDL)

The Go fork queried fields that do not exist in the published schema; these were rewritten against the real SDL (terraform-provider-observe):

| Go source assumed | Published schema reality | What I did |
| --- | --- | --- |
| `saveDatasetDryRun` mutation | No such mutation | Use `saveDataset(..., dependencyHandling: { saveMode: PreflightDatasetAndDependencies })`, the schema's preflight/dry-run mechanism (matches operation/dataset.graphql) |
| `errorDatasets { dataset { id name } errorText }` | `DatasetError { datasetId, datasetName, text }` (no nested `dataset`, no `errorText`) | Select `datasetName` / `text` |
| `getDatasetsAffectedByDatasetUpdate { affectedDatasets { dataset { id name } dependencyType } }` | `DatasetsAffectedByDatasetUpdateResult` has only `dematerializedDatasets`, `editForwardDematerializedDatasets`, `rematerializationCosts`; no `affectedDatasets` or `dependencyType` | Build the impact table from `dematerializedDatasets` + `editForwardDematerializedDatasets` (`dataset { id name }`); the IMPACT column reflects which list a row came from. `dependencyType` is not exposed by the schema and could not be reproduced |
| `dataset: { name }` mapped directly | `DatasetInput.label: String!` (no `name` field) | Map input `dataset.name` → `DatasetInput.label` |
| `query.stageQueries[].stageID` | `MultiStageQueryInput` requires `outputStage: String!` + `stages: [StageQueryInput!]!`, where the field is `id` (not `stageID`) and `input: [InputDefinitionInput!]!` is required | Map `stageID` → `id`, set `input: []`, derive `outputStage` from the terminal stage |
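The input mapping in the last two rows of the table can be sketched roughly as below. The types are trimmed, hypothetical stand-ins for the schema inputs named in the table (the real generated types carry more fields), `toSaveDatasetVariables` is an illustrative name, and treating the last entry of `stageQueries` as the terminal stage is an assumption.

```typescript
// Go-CLI-style file input (field names from the PR description).
interface DatasetFileInput {
  workspaceId: string;
  dataset: { name: string };
  query: { stageQueries: { stageID: string; pipeline: string }[] };
}

// Trimmed stand-ins for the schema input types; not the generated types.
interface StageQueryInput {
  id: string;
  pipeline: string;
  input: unknown[];
}

interface SaveDatasetVariables {
  workspaceId: string;
  dataset: { label: string };
  query: { outputStage: string; stages: StageQueryInput[] };
}

function toSaveDatasetVariables(file: DatasetFileInput): SaveDatasetVariables {
  // stageID -> id, and the required `input` list is sent empty.
  const stages = file.query.stageQueries.map((s) => ({
    id: s.stageID,
    pipeline: s.pipeline,
    input: [],
  }));

  // Assumption: the terminal stage is the last one listed in the file.
  const terminal = stages[stages.length - 1];
  if (!terminal) throw new Error("at least one stage query is required");

  return {
    workspaceId: file.workspaceId,
    dataset: { label: file.dataset.name }, // name -> label
    query: { outputStage: terminal.id, stages },
  };
}
```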

The impact command issues a second preflight saveDataset to obtain errorDatasets, since getDatasetsAffectedByDatasetUpdate does not return them.
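For illustration, building the NAME / ID / IMPACT rows from the two lists might look like the sketch below. It assumes each list entry exposes `dataset { id name }`; the IMPACT label strings, the type names, and the `buildImpactRows` / `renderTable` helpers are hypothetical and not taken from this PR.

```typescript
interface DatasetRef {
  id: string;
  name: string;
}

// Assumed selection: each entry in either list exposes `dataset { id name }`.
interface AffectedResult {
  dematerializedDatasets: { dataset: DatasetRef }[];
  editForwardDematerializedDatasets: { dataset: DatasetRef }[];
}

interface ImpactRow {
  name: string;
  id: string;
  impact: string;
}

// The IMPACT value records which list a row came from (labels are illustrative).
function buildImpactRows(result: AffectedResult): ImpactRow[] {
  const tag = (items: { dataset: DatasetRef }[], impact: string): ImpactRow[] =>
    items.map(({ dataset }) => ({ name: dataset.name, id: dataset.id, impact }));
  return [
    ...tag(result.dematerializedDatasets, "dematerialized"),
    ...tag(result.editForwardDematerializedDatasets, "edit-forward dematerialized"),
  ];
}

// Plain-text table with columns padded to the widest cell.
function renderTable(rows: ImpactRow[]): string {
  const header = ["NAME", "ID", "IMPACT"];
  const body = rows.map((r) => [r.name, r.id, r.impact]);
  const widths = header.map((h, i) => Math.max(h.length, ...body.map((row) => row[i].length)));
  return [header, ...body]
    .map((cells) => cells.map((c, i) => c.padEnd(widths[i])).join("  ").trimEnd())
    .join("\n");
}
```

In this sketch the error datasets from the second preflight `saveDataset` call would be written separately (for example with `console.error`) so that the table on stdout stays machine-readable.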

Verification

  • Codegen (scoped to src/gql/dataset/**): ✔ Generate — both operations validate against the published SDL.
  • Typecheck (scoped tsconfig): no errors in src/commands/dataset or src/gql/dataset source files (pre-existing unrelated errors from missing rest/generated/other resources' types are out of scope).
  • Lint/format: eslint + prettier clean on all new/edited files.
  • Tests: bun test src/commands/dataset/{dry-run,impact,input}.test.ts → 14 pass / 0 fail, 32 assertions across 3 files.

🤖 Generated with Claude Code

Adds `dataset dry-run` and `dataset impact` subcommands to the existing
dataset route, ported from the deprecated Go Observe CLI.

- dry-run: previews saving a dataset pipeline via `saveDataset` with
  `dependencyHandling.saveMode = PreflightDatasetAndDependencies` (nothing
  persisted); reports the dataset, rematerialized datasets, and compile
  errors, exiting 1 if any error datasets are returned.
- impact: renders a table of downstream datasets affected by a save and
  sends compile/validation errors to stderr.

The Go source queried a `saveDatasetDryRun` mutation and
`affectedDatasets { dataset { id name } dependencyType }` field that do not
exist in the published GraphQL schema; the operations here were rewritten
against the real SDL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Port dataset dry-run / dataset impact from the Go fork

2 participants